SPECIFICATION

Hierarchical file system 

BACKGROUND OF THE INVENTION
Field of the Invention 

The present invention relates to a method of storing and retrieving
data using a computer, and more specifically to a hierarchical filing
system.

Prior Art 

In a computer system, information is typically stored as signals on
various storage media, such as magnetic tapes, disks, semiconductor
devices, etc. As storage densities increased with advances in storage
device technology, it became possible for a device to store much more
information than previously.

When information is stored on a device, it is cataloged so that the
same information can later be retrieved when desired. Normally, a
unique code name is attributed to a particular body of data to
differentiate it from others. To retrieve a desired body of data, the
code name associated with that data is used: the device searches for
that code name and retrieves the desired data when the code name is
found.

It is well-known in the prior art that each separate body of data is
termed a file and the cataloging of these files on a device is termed
filing. Typically, code names associated with particular data contain
pointers which point to areas in memory reserved for mass storage. The
various code names and their pointers comprise the cataloging system.
When high-density storage devices are used, millions of bits of
information can be stored on such a device, which permits hundreds,
thousands, and even millions of files to be created. Searching through
these files serially for a specific file is time-consuming.

What is needed is a filing system for a high-density storage medium
which rapidly searches for and retrieves the desired stored file.
Further, with the advent of the personal computer (PC) and the small
business computer, where physical size is a concern, it is desirable
to have a filing system which can be implemented in fewer lines of
program code, yet remain effective.

SUMMARY

A method for providing a hierarchical filing system is described. The
hierarchical filing system provides a catalog of the data stored in
various locations within a memory device. Typically, one cataloging
structure is used to organize a volume of memory.

The cataloging structure of the hierarchical filing system is provided
by an upside-down tree type structure wherein there is a starting root
directory. Other directories and files emanate as offspring. A
plurality of descendant levels branch downward to provide the
hierarchical structure of the catalog. The cataloging structure
contains the location information of where the actual data is stored.

The file cataloging system is implemented using a B-Tree. The
cataloging information is kept in the leaf nodes of the B-Tree. The
non-leaf nodes (index nodes) of the B-Tree contain information that
allows searching for particular catalog information by using the code
name or key of the corresponding file. Key values, which are used to
identify and catalog various files in the cataloging system, are also
used to organize the catalog in the leaf nodes of the B-Tree. The keys
are placed in ascending order for systematic access. Further, the
B-Tree grows by using left rotates and left splits, with catalog
information about new files inserted from the right, to maintain a
balanced tree.

When a file's data is stored, additions, deletions and modifications
will typically result in non-contiguous physical storage of the data
in the memory device. Each of the contiguous segments of the file is
known as a file extent. A record of the physical location of the
extents for a particular file is maintained in one or more extents
records. The hierarchical filing system uses a file extents list to
maintain the extents records of the various files on the memory
device.

The present invention maintains the first extents record of a file in
the cataloging structure, but any further extents records are
maintained in a separate file extents list. This file extents list is
implemented in a second B-Tree structure.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 is a representation of a prior art flat filing system.

Figure 2 is a representation of a hierarchical filing system of the
present invention.

Figure 3 is a representation of a B-Tree structure of the present
invention.

Figure 4 is a representation of contents of a node for the B-Tree
structure of Figure 3.

Figure 5 is a representation of a left-split and a left-rotate
operation of a B-Tree structure of the preferred embodiment.

Figure 6 is a representation of a cataloging structure of the
preferred embodiment and an organization of the cataloging structure
in various nodes of a B-Tree.

Figure 7 is a representation of a volume allocation mapping in a
filing system of the preferred embodiment.

Figure 8 is a representation of a file extents list of the preferred
embodiment, showing various file extents in memory.

Figure 9 is a representation showing the file extents organization in
the Catalog and Extents B-Trees.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
The present invention describes a method of storing and retrieving
information using a hierarchical filing system. In the following
description, numerous specific details are set forth in order to
provide a thorough understanding of the present invention. It will be
obvious, however, to one skilled in the art that the present invention
may be practiced without these specific details. In other instances,
well-known methods have not been described in detail in order not to
unnecessarily obscure the present invention.

Referring to Figure 1, a prior art flat filing system 10 is shown
having a directory 11 and files 12-15. For ease of understanding, a
directory is shown pictorially as a folder and a file is shown as a
sheet of paper with a folded corner. The pictorial representation
applies well to an analogy of placing papers into folders (files into
directories). In the prior art system 10, there is a single directory
11, which contains locator information for files 12-15. Each of the
files 12-15 contains data associated with a specific body of stored
information. In this particular example of a prior art system 10, to
access file 15, a serial search is made through directory 11 until the
file address of file 15 is located. Such a sequential search takes
considerable time when a substantial number of files exist in
directory 11. Although in this hypothetical example directory 11
maintains pointer addresses to only four files 12-15, directory 11
will continue to store addresses of subsequent files in a sequential
fashion.

Figure 2 illustrates the architecture of the Hierarchical Filing
System (HFS) of the present invention. This particular HFS 16 includes
a root directory 17 and files 21-24. The HFS 16 also includes
directories 18-20. Each directory is capable of containing files, as
well as other directories, such as directory 18 containing directory
20. Each directory is a branching node, allowing for none or a
plurality of sub-branching nodes. Each directory contains information
which permits the branching to occur. The actual data is stored in the
files 21-24. Because each file is a termination node, it does not need
to maintain further branching information. Instead, each file stores
the actual data. Therefore, the directories 17-20 maintain branching
information, while files 21-24 contain the stored data.

HFS 16 accesses files 21-24 in a hierarchical fashion so that a serial
search for the files is not necessary. Assume in the example of Figure
2 that access to data stored in file 23 is desired. A search of
directory 17 reveals that two possible paths exist in seeking the
address of file 23. One path from directory 17 leads to directory 18
and the other to directory 19. The desirable path is to directory 18,
at which point there are again two paths. The desirable path from
directory 18 leads directly to file 23. Although this example is
simplistic because of the minuscule number of files shown, one can
appreciate the file search time saved when a substantially large
number of files are present.
Further, as an example, if file 22 had been chosen, the path from
directory 18 would have led to directory 20, at which point two paths
exist from directory 20. The desirable path to file 22 from directory
20 then would have been chosen. HFS 16, although shown in a particular
form in Figure 2, may have any number of levels (branchings) down from
the root directory 17 as well as any number of branches from a
particular directory. However, it is to be noted that all data is
stored in the represented files 21-24, which are all located at the
termination nodes of HFS 16.

In actuality, the cataloging architecture of the preferred embodiment
contains cataloging locator description information in the HFS 16
structure. The catalog entries for files 21-24 contain pointers which
provide locator descriptions to locate the places in the storage area
where the actual stored data is kept.

B-TREE

The HFS of the present invention is implemented using two B-Tree
structures in the preferred embodiment, the Catalog B-Tree and the
File Extents B-Tree. A B-Tree structure is well-known in the prior art
and is described in The Art of Computer Programming, Volume 3 (Sorting
and Searching), by Donald E. Knuth, at Section 6.2.4, titled "Multiway
Trees", pp. 471-479 (1973). The nodes of a B-Tree contain records,
wherein each record is comprised of certain information, either
pointers or data, and a key associated with that record.

Referring to Figure 3, a hypothetical B-Tree is illustrated. A basic
feature of the B-Tree 31 is that data is stored only in leaf nodes
35-38. The internal nodes 32-34, also known as index nodes, contain
pointers to other nodes such that these index nodes 32-34 provide an
index for accessing the data records stored in the leaf nodes 35-38.
Each record 39 includes a key 40 and an information segment 41. Within
each node, the records are maintained so that their keys are in
ascending order. The example B-Tree 31 of Figure 3 contains
hypothetical keys which have been inserted to show the structure of
the tree, and the relationship between index nodes 32-34 and leaf
nodes 35-38. Leaf node 35 contains key values 48 and 50. The first key
of a node is also represented as a key in its ascending (parent) node.
Therefore key 48, which is the first key of leaf node 35, is also
represented as a key within index node 33. Key 53, which is the first
key of leaf node 36, is represented as a key within index node 33.
Because key 48 is the first key within index node 33, it is again
represented as a key within index node 32. This pattern is repeated
for each leaf node 35-38 and each ascending index node 32-34 of the
B-Tree structure. Although Figure 3 shows only three levels and two
keys per node, any number of keys per node, as well as any number of
levels, may be chosen for a particular B-Tree structure. B-Tree 31 of
Figure 3 as drawn is a hypothetical example for illustration purposes
only.

When a data record is needed, the key of the desired record is
provided. The search begins at the root node, which is also an index
node. A search is performed within the node until the record with the
highest key that is not higher than the search key is reached. Assume
in the hypothetical example of Figure 3 that data with key 59 is to be
selected. The search commences at the root node 32, wherein key 56 is
selected because the value 56 is the highest key that is not greater
than the search key. The pointer of key 56 selects index node 34, and
the search continues within index node 34. Again, key 56 is chosen
because it is the highest key that is not greater than the search key
(the next key, 63, is greater than the search key). The pointer of key
56 in index node 34 selects leaf node 37. Within leaf node 37, another
search is made to identify search key 59. When search key 59 is found,
its associated information (data) is used.
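
The search rule just described (select the highest key that is not
greater than the search key, then follow its pointer) can be sketched
in Python as follows. This is only an illustrative sketch, not the
implementation of the preferred embodiment: the class and function
names are hypothetical, keys 54 and 68 are invented to fill the leaf
slots that the text does not enumerate, and a binary search within
each node is assumed where the text describes a scan.

```python
from bisect import bisect_right


class Node:
    """One B-Tree node; keys are kept in ascending order."""

    def __init__(self, keys, children=None, data=None):
        self.keys = keys          # ascending keys of the records in this node
        self.children = children  # index node: one child pointer per key
        self.data = data          # leaf node: one information segment per key


def search(root, key):
    """Return (data, index keys selected on the way down) for `key`."""
    node, selected = root, []
    while node.children is not None:          # descend through index nodes
        i = bisect_right(node.keys, key) - 1  # highest key not greater than `key`
        if i < 0:                             # `key` is below every key in the tree
            return None, selected
        selected.append(node.keys[i])
        node = node.children[i]
    i = bisect_right(node.keys, key) - 1      # same rule inside the leaf node
    if i >= 0 and node.keys[i] == key:
        return node.data[i], selected
    return None, selected


# Tree shaped like Figure 3. Keys 48, 50, 53, 56, 59 and 63 come from the
# text; keys 54 and 68 are made up to fill the remaining leaf slots.
leaf35 = Node([48, 50], data=["rec48", "rec50"])
leaf36 = Node([53, 54], data=["rec53", "rec54"])
leaf37 = Node([56, 59], data=["rec56", "rec59"])
leaf38 = Node([63, 68], data=["rec63", "rec68"])
node33 = Node([48, 53], children=[leaf35, leaf36])
node34 = Node([56, 63], children=[leaf37, leaf38])
root32 = Node([48, 56], children=[node33, node34])

print(search(root32, 59))  # ('rec59', [56, 56])
```

The second element of the result lists the index keys chosen at nodes
32 and 34, which matches the walk-through for key 59 given above.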

A particular pointer in an index record leads to another node one
level down in the B-Tree 31, for example from node 32 to node 34. The
process continues until a leaf node is reached, whereupon its records
are examined until the desired key is found. If the desired key is not
present, then the search stops when a key larger than the search key
is reached or when all the records in the leaf node have been
examined. The key values may be numeric, alphabetical or alphanumeric.
Figure 4 shows the structure of any of the nodes of a B-Tree of the
present invention. Each node 42 includes a node descriptor segment 43,
a records segment 44, a record offset segment 46, and can have a free
space segment 45. Each node 42 begins with a node descriptor segment
43. NDNRECS 58 contains the number of records currently in the node.
NDTYPE 54 indicates the type of node, either leaf or index node.
NDHEIGHT 57 indicates the height of the node in the tree, wherein leaf
nodes are at level 1, the index nodes just above them are at level 2,
etc. NDBLINK 52 and NDFLINK 51 are used with B-Tree nodes as a way of
quickly moving through the records of the various nodes at a given
level. For each node, NDBLINK 52 contains a pointer to the previous
node, and NDFLINK 51 contains a pointer to the subsequent node at the
same level. In Figure 3, for example, NDBLINK for node 36 would point
to node 35, and NDFLINK for node 36 would point to node 37. Therefore,
NDBLINK 52 and NDFLINK 51 are means of locating adjacent nodes without
first going back up the B-Tree.

The records segment 44 contains the B-Tree's records, each with its
key and pointer or data information. In this particular example, there
are two records 60 and 61. The records in a node can be of variable
length. For this reason, offsets to the beginning of each record are
needed. The records segment begins immediately following the node
descriptor segment 43. The records are followed by a free space
segment 45, which is simply the unused space of the node. Therefore,
the free space segment may not exist in some instances, such as when
the node is completely filled.
The record offset segment 46 at the end of the node contains the
offset information for records 60 and 61. Offset 68 contains offset
information for record 60 and offset 67 contains offset information
for record 61. Offset 66 contains the offset necessary to determine
free space 62. Thus the record segment 44 builds downward into the
free space segment 45, while the record offset segment 46 builds
upward into the free space segment 45 from the opposite end.
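
The node layout of Figure 4 can be illustrated with a small Python
sketch that packs variable-length records from the front of a node and
their offsets from the back. The node size, the field widths and
ordering of the descriptor, the numeric code for a leaf node, and the
function names are all assumptions made for this illustration; the
text names the descriptor fields but does not give their sizes.

```python
import struct

NODE_SIZE = 512                       # assumed node size in bytes
# NDFLINK, NDBLINK, NDTYPE, NDHEIGHT, NDNRECS (field widths are assumed)
DESCRIPTOR = struct.Struct(">IIBBH")
OFFSET = struct.Struct(">H")          # one 16-bit offset per slot (assumed)
LEAF = 1                              # assumed NDTYPE code for a leaf node


def build_node(records, flink=0, blink=0, ntype=LEAF, height=1):
    """Pack records after the descriptor and their offsets from the end."""
    node = bytearray(NODE_SIZE)
    DESCRIPTOR.pack_into(node, 0, flink, blink, ntype, height, len(records))
    pos = DESCRIPTOR.size             # records start right after the descriptor
    slot = NODE_SIZE - OFFSET.size    # offset slots start at the very end
    for rec in records:
        # The record must end before the slot reserved for the next offset.
        if pos + len(rec) > slot - OFFSET.size:
            raise ValueError("node is full")
        OFFSET.pack_into(node, slot, pos)
        node[pos:pos + len(rec)] = rec
        pos += len(rec)               # record segment grows toward the end
        slot -= OFFSET.size           # offset segment grows toward the start
    OFFSET.pack_into(node, slot, pos)  # final slot: where free space begins
    return bytes(node)


def read_records(node):
    """Recover the variable-length records using the offset segment."""
    nrecs = DESCRIPTOR.unpack_from(node, 0)[4]
    offsets = [OFFSET.unpack_from(node, NODE_SIZE - OFFSET.size * (i + 1))[0]
               for i in range(nrecs + 1)]
    return [node[offsets[i]:offsets[i + 1]] for i in range(nrecs)]


node = build_node([b"key1data", b"k2"])
print(read_records(node))
```

Because each record's length is the difference between two adjacent
offsets, the extra free-space offset is what allows the last record to
be read back without storing explicit lengths.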
If node 42 is an index node, then each record 60 and 61 is comprised
of a key and pointer information. Further, NDFLINK 51 and NDBLINK 52
would contain adjacent index node linking pointers. If node 42 is a
leaf node, then each record 60 and 61 is comprised of a key and data
information. NDFLINK 51 and NDBLINK 52 would then contain leaf node
linking pointers. It is also appreciated that although a particular
format is illustrated for node 42, the format may be modified readily
to include other types of information. Also, in the preferred
embodiment, data information in the leaf nodes of the HFS catalog
B-Tree is used to address locations in memory where the actual data is
stored.
Referring to Figure 5, a specialized B-Tree expansion architecture as
implemented in the preferred embodiment is shown. A node 70, which is
equivalent to node 42 of Figure 4, is shown having pointers to two
lower-level nodes 71 and 73, which may be index or leaf nodes.
Although only two nodes 71 and 73 are shown at the lower level, any
number of nodes may reside at this lower level. Also, in this
particular hypothetical example, nodes 71 and 73 are only partially
filled.

For a B-Tree to maintain its balance, records must be kept uniformly
distributed within the hierarchical structure. An unbalanced tree
results when records are not maintained uniformly in each node or when
nodes are heavily stacked toward one branch of the B-Tree. The
preferred embodiment uses a technique of left rotates and left splits
to move records from one node to another to maintain a balanced tree.
When records of a node are to be transferred to another node, the left
rotate operation is used. In this instance, records in node 73 are
left rotated to its left adjacent node 71, as shown by arrow 77.
If another node is needed, such as when records in node 73 must be
rotated and node 71 cannot accommodate records from node 73, a left
split operation is used to insert node 72 to the left of node 73,
between nodes 71 and 73. In this instance, node 72 is inserted to link
node 71 and node 73, as shown by arrows 78. When node 72 is inserted,
appropriate pointer links are established with its index node 70, as
well as adjacent link pointers for nodes 71 and 73. Continually moving
data leftward and inserting new data at the right extremities helps
keep the B-Tree balanced. Because the HFS of the present invention is
structured to have the ascending nodes organized in a rightward
direction, the balancing is maintained even though the rotates and
splits are made toward the left. It is appreciated that right split
and right rotate operations, or balanced insertions using both right
and left operations, can be used as well. Although the preferred
embodiment uses and attempts to maintain a balanced B-Tree for search
efficiency, almost any B-Tree structure can be used, including an
unbalanced B-Tree.
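
The left rotate and left split operations can be sketched on a single
level of sibling nodes. The Python below is a deliberately simplified
illustration, not the procedure of the preferred embodiment: nodes are
plain lists of keys, the capacity of four records and the choice to
move half a node at a time are assumptions, new keys are assumed to
arrive in ascending order at the right extremity, and the updates to
the parent index node and to the NDFLINK/NDBLINK pointers are omitted.

```python
CAPACITY = 4  # assumed maximum number of records per node


def left_rotate(nodes, i, count):
    """Move the `count` lowest records of nodes[i] into its left neighbour."""
    nodes[i - 1].extend(nodes[i][:count])
    del nodes[i][:count]


def left_split(nodes, i, count):
    """Insert a new node to the left of nodes[i] holding its lowest records."""
    new_node = nodes[i][:count]
    del nodes[i][:count]
    nodes.insert(i, new_node)  # the new node sits between the old neighbours


def insert_rightmost(nodes, key):
    """Append `key` (assumed larger than all existing keys) at the right."""
    last = len(nodes) - 1
    if len(nodes[last]) == CAPACITY:
        room = CAPACITY - len(nodes[last - 1]) if last > 0 else 0
        if room > 0:
            left_rotate(nodes, last, min(room, CAPACITY // 2))
        else:
            left_split(nodes, last, CAPACITY // 2)
    nodes[-1].append(key)


level = [[1, 2], [3, 4, 5, 6]]   # in the roles of nodes 71 and 73
insert_rightmost(level, 7)       # node 71 has room: left rotate
insert_rightmost(level, 8)
insert_rightmost(level, 9)       # node 71 is now full: left split
print(level)                     # [[1, 2, 3, 4], [5, 6], [7, 8, 9]]
```

In the final state the middle list plays the role of node 72: it was
created by the split and sits between the two original nodes, while
the keys remain in ascending order from left to right.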

CATALOG TREE 

Referring to Figure 6, a hypothetical catalog 90 is used to illustrate
the implementation of cataloging of the preferred embodiment. The
structure 90 has a root directory 91 named "Volume". Each directory of
the preferred embodiment is assigned a unique numerical identifier
known as the directory identifier (DirID). The root directory of
catalog 90 has a DirID value of 2. Root directory 91 has three
branches comprised of directory 92 and files 93 and 94. Directory 92
has a name of "Folder" and a DirID value of 29. In turn, directory 92
has two branches comprised of files 95 and 96. Files 93-96 are named
"A", "B", "C" and "D", respectively, in this example. The architecture
of the directories and files follows the HFS structure as previously
explained in Figure 2. The complete cataloging structure 90 is stored
as data records in various leaf nodes of a B-Tree of the kind shown in
Figures 3 and 4, known as the catalog B-Tree. It is appreciated that
the cataloging structure 90, although a tree, is in itself not a
B-Tree. Rather, the form of structure 90 is stored in the various leaf
nodes of a B-Tree. The cataloging structure 90 should not be confused
with the previous description of the B-Tree: catalog 90 and the B-Tree
structure are two separate and distinct structures. The hierarchical
structure of the catalog 90 is implemented using a B-Tree structure
and stored as data records in leaf nodes of a B-Tree similar to that
described earlier.



The hierarchical catalog structure 90 is stored in a storage device as
shown by a memory map 97 of Figure 6. Cataloging map 97 is comprised
of three possible types of records: directory records 100, file
records 101, and thread records 102. Each record 100-102 is comprised
of a key 103 and an information segment 104, as earlier described for
a leaf node of a B-Tree. The key 103 of each record is comprised of a
value 105 and a name 106. The key 103 of a directory record, such as
that of 91 and 92, is comprised of its parent directory's DirID value
105 and its directory name 106. The information segment 104 of each
directory record, such as that of directories 91 and 92, is comprised
of the directory's DirID value 107. For directory 92, the directory's
DirID has been given the value of 29, and it has a name of "Folder".
The parent DirID of record 92 has been given the value 2 because
directory 92 is an offspring of directory 91 in the structure 90.
Directory record 91 has a DirID value of 2, with a corresponding name
of "Volume". Because directory 91 is a root directory, its parent
DirID has been given the value of 1, wherein the value 1 refers to the
foundation of the filing system itself.
A file record, such as file records 93-96, is also comprised of a key
113 and an information segment 114, wherein key 113 is also comprised
of a parent DirID value and a name. However, in the information
segment 114, the descriptive location information for the actual
stored file data is maintained, as well as a unique file number. The
information segments 114 of file records 93-96 contain the descriptive
location of the actual stored data.

File record 94, having a file name of B, and file record 93, having a
file name of A, both have a parent DirID value of 2. The parent DirID
value of 2 signifies that files A and B are direct offspring of
directory "Volume", which has a DirID value of 2. File 95, having a
name C, and file 96, having a name D, have parent DirID values of 29,
which reflect the origination of files C and D as offspring of
directory 92 labeled "Folder", which has a DirID value of 29.
Therefore, the key 103 of any file or directory record provides both
the name of that particular record and the DirID value of its parent
node.

To provide the interconnection of the different branches, a thread
record 102 is provided for each directory. The key of a thread record
contains a DirID value and a null name, which is equivalent to having
no name at all. In the example of Figure 6, thread record 108 provides
the connection between the directory "Folder" and files C and D. In
the key 111 of thread record 108, only the DirID value of "Folder" is
given. In the information segment of thread record 108, the DirID
value of "Folder"'s parent and the directory's name "Folder" are
given. Therefore, when file C, having a parent DirID of 29, attempts
to link to its immediate parent directory 92, which has a DirID of 29,
the thread record 108 provides the name (Folder) of the parent
directory 92, as well as the parent DirID value of directory 92, which
is equal to 2.
Equivalently, thread record 109 provides the name (Volume) of
directory 91 as well as its parent directory DirID value for the three
offspring 92-94 of directory 91. By having directory records 91-92 and
file records 93-96, along with thread records 108-109 for each
directory, the cataloging structure 90 is interconnected into an HFS,
wherein the descriptive location information for the actual stored
data is stored in file records 93-96 as shown in the structure 97 of
Figure 6.

By implementing the cataloging structure 90 using a B-Tree structure,
the hierarchical configuration of structure 90 is easily stored in the
leaf nodes of a B-Tree as described earlier. For example, when file C
is to be accessed by a computer, the system will perform a B-Tree
search. Referring to the catalog example 90 of Figure 6, when the file
with name C is to be found, the search path must be specified for this
search. This can be given as the sequence of the names of all
directories on the path from the root to the file, thus "Volume",
followed by "Folder", and finally "C". The search begins by finding
the directory record in the Catalog B-Tree that corresponds to
"Volume": its name is "Volume" and, since it is the root, its parent
DirID value is 1. The catalog B-Tree is searched for a directory
record with key <1>Volume; thus, directory record 91 is found. Its
information segment then provides the DirID value 2 of this directory.
Now a search is made through the B-Tree for the record with key
<2>Folder, which leads to the directory record 92, whose information
segment provides this directory's DirID value of 29. Finally, a search
of the B-Tree is made to find the data record with key <29>C. This
leads the search directly to the file record 95, whose information
segment contains the information about the physical location of the
data contained in the desired file.
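
The three successive key searches for "Volume", "Folder" and "C" can
be sketched in Python with the records of Figure 6 held in a sorted
list that stands in for the leaf level of the Catalog B-Tree. The
tuple layout of the records, the function names, and the placeholder
location strings are assumptions made for this illustration; the
DirID values 1, 2 and 29 and the names come from the text.

```python
from bisect import bisect_left

ROOT_PARENT_ID = 1  # parent DirID of the root directory, per the text

# key = (parent DirID, name); a thread record has a null (empty) name.
CATALOG = sorted([
    ((1, "Volume"), ("dir", 2)),
    ((2, ""), ("thread", (1, "Volume"))),
    ((2, "A"), ("file", "data location of A")),
    ((2, "B"), ("file", "data location of B")),
    ((2, "Folder"), ("dir", 29)),
    ((29, ""), ("thread", (2, "Folder"))),
    ((29, "C"), ("file", "data location of C")),
    ((29, "D"), ("file", "data location of D")),
], key=lambda record: record[0])
KEYS = [key for key, _ in CATALOG]


def find(parent_id, name):
    """Return the information segment stored under <parent_id>name."""
    i = bisect_left(KEYS, (parent_id, name))
    if i < len(KEYS) and KEYS[i] == (parent_id, name):
        return CATALOG[i][1]
    return None


def resolve(path, start_id=ROOT_PARENT_ID):
    """Follow a sequence of names, one key search per path component."""
    dir_id, info = start_id, None
    for name in path:
        if dir_id is None:            # previous component was a file
            return None
        info = find(dir_id, name)
        if info is None:
            return None
        dir_id = info[1] if info[0] == "dir" else None
    return info


def path_to(dir_id):
    """Use thread records to recover the names leading to a directory."""
    names = []
    while dir_id != ROOT_PARENT_ID:
        _, (parent_id, name) = find(dir_id, "")
        names.append(name)
        dir_id = parent_id
    return names[::-1]


print(resolve(["Volume", "Folder", "C"]))  # ('file', 'data location of C')
print(path_to(29))                         # ['Volume', 'Folder']
```

Calling `resolve(["Folder", "C"], start_id=2)` gives the same file
record, which corresponds to starting the specification at the DirID
of a directory partway along the path.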

It will be appreciated that the specification of the file in the above
example could start with the DirID value of any directory on the path
from the root to the desired file, and would then consist of this
DirID value and the sequence of names of the directories on the
balance of the path from that directory to the desired file. The
search mechanism followed is an obvious variant of the one indicated
above.
Although cataloging structure 90 is a simpli- 
fied structure and Figure 6 only shows the 
presence of a single structure having a single 
root directory 91, a cataloging structure may 



The preferred embodiment uses one HFS cataloging structure per memory
device, such as a disk. However, such a disk can be partitioned and an
HFS catalog assigned to each such partition.

The catalog records of structure 97 of Figure 6 are stored as the data
records in the leaf nodes 42 of Figure 4 of a catalog B-Tree. These
records are inserted and maintained in the catalog B-Tree in ascending
alphanumeric order. Thus, if the leaf nodes of the B-Tree are
traversed from left to right, the data records will be encountered in
the order shown in structure 97 of Figure 6. This order maintains the
records in ascending order first by the DirID value part of the key.
Then, among records with the same DirID value in their keys, the order
is alphabetical on the name part of the key.
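
One consequence of this ordering is that all records whose keys share
a DirID value are adjacent, with the thread record (null name) first.
The Python sketch below illustrates this with the keys of Figure 6. It
assumes that Python's default string comparison is an acceptable
stand-in for the alphabetical order of the name part, and the function
name is hypothetical.

```python
from bisect import bisect_left

# Keys of the Figure 6 records as (DirID value, name); "" is the null
# name of a thread record. Tuple comparison orders by DirID, then name.
KEYS = sorted([
    (29, "D"), (2, "Folder"), (1, "Volume"), (2, "B"),
    (29, ""), (2, "A"), (29, "C"), (2, ""),
])


def children(dir_id):
    """List the names catalogued directly under the directory `dir_id`."""
    lo = bisect_left(KEYS, (dir_id, ""))      # first key with this DirID
    hi = bisect_left(KEYS, (dir_id + 1, ""))  # first key of the next DirID
    return [name for _, name in KEYS[lo:hi] if name != ""]


print(KEYS[:3])       # [(1, 'Volume'), (2, ''), (2, 'A')]
print(children(2))    # ['A', 'B', 'Folder']
print(children(29))   # ['C', 'D']
```

Listing a directory therefore needs only one search for the start of
the range followed by a left-to-right walk of adjacent records.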
It is also appreciated that other pertinent information may be stored
in the various records besides what has been disclosed in Figure 6.
For example, directory and file records of the present invention
maintain flags, the date and time of creation of the directory or the
file, as well as the date and time of last modification. Also, file
records include such items as flags for locking the file, values to
set the logical and physical end of file, and the size of the file.

FILE EXTENTS TREE 

As already noted, the catalog B-Tree's file record of a particular
file contains information about the locations in the memory device
where the file's data is stored. The memory device is considered to be
a sequentially numbered collection of blocks. A series of contiguous
memory blocks is called an extent. Ideally, a file would be stored in
a single extent having a contiguous memory allocation space. However,
due to the size of certain files, as well as subsequent additions,
deletions and modifications to existing files, files are usually
stored in more than one allocated area of the memory. Except in the
case of preallocated or small files, the contents of a particular file
are usually stored in more than one extent, separated into
non-contiguous sections on a volume. Each file extent can be
identified by an extent descriptor. Thus, the complete location
information of a particular file is a sequential extents list
consisting of the extent descriptors of the various extents containing
the file's data.

The file extents list of the present invention is also organized as a
B-Tree, known as the File Extents B-Tree, and records the volume
location and size of the various extents that comprise the files.
Although almost any memory allocation system can employ the file
extents record of the present invention, a specific memory allocation
system is described to illustrate the file extents record of the
preferred embodiment.



GB2 196 764A

Referring to Figure 7, a volume 120, which is a portion of a memory
device, such as a hard disk, is shown. Volume 120 is segmented into a
number of logical blocks 126. Typically, each logical block 126 is
comprised of a predetermined fixed number of bytes, such as 512 bytes
for the preferred embodiment. A fixed number of logical blocks,
starting at block 0 and ending at block n, is reserved for volume
information. The balance of the memory device, starting at block n+1,
is available for data storage, and this storage area is separated into
allocation units, wherein each allocation unit is comprised of one or
more contiguous logical blocks.

Volume 120 includes four areas 121-124. System start-up area 121
contains certain configurable system parameters which are well-known
in operating a disk or other memory devices. Volume information area
122 contains information regarding the housekeeping parameters of the
volume, such as the number and size of the allocation units. Volume
bit map 123 maintains a record of each allocation unit on the volume
120 and uses a bit map to designate use or non-use of each allocation
unit.

Commencing at block n+1, a file content area 124 extends to the end of
the volume 120. File content area 124 is separated into a number of
allocation units, wherein each allocation unit is comprised of a fixed
number of logical blocks. While the bit map 123 maintains volume space
management, it does not provide file mapping. The file mapping
function is provided by the file extents lists.
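
The division of labour between the volume bit map and the extents
lists can be illustrated with a minimal bit map in Python. The text
states only that one bit designates use or non-use of each allocation
unit; the class and method names, the most-significant-bit-first
layout, and the first-fit search policy are assumptions of this
sketch.

```python
class VolumeBitmap:
    """One bit per allocation unit: 1 means in use, 0 means free."""

    def __init__(self, unit_count):
        self.unit_count = unit_count
        self.bits = bytearray((unit_count + 7) // 8)

    def is_used(self, unit):
        return bool(self.bits[unit // 8] & (0x80 >> (unit % 8)))

    def _mark(self, start, count, used):
        for unit in range(start, start + count):
            mask = 0x80 >> (unit % 8)
            if used:
                self.bits[unit // 8] |= mask
            else:
                self.bits[unit // 8] &= ~mask & 0xFF

    def allocate(self, count):
        """First-fit search for `count` contiguous free allocation units."""
        if count < 1:
            raise ValueError("count must be at least 1")
        run = 0
        for unit in range(self.unit_count):
            run = 0 if self.is_used(unit) else run + 1
            if run == count:
                start = unit - count + 1
                self._mark(start, count, True)
                return (start, count)   # usable as an extent descriptor
        return None                     # no contiguous run is large enough

    def free(self, start, count):
        self._mark(start, count, False)


bitmap = VolumeBitmap(16)
print(bitmap.allocate(3))   # (0, 3)
print(bitmap.allocate(2))   # (3, 2)
bitmap.free(0, 3)
print(bitmap.allocate(4))   # (5, 4)
```

The bit map answers only whether a unit is free. Which file owns a
given run of units is not recorded here, which is why the pairs
returned by `allocate` would have to be kept in a file's extents list.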

Referring also to Figure 8, a portion of file content area 124 is
shown containing information attributed to a file labeled file E. In
this hypothetical example the entire contents of file E are separated
into seven extents 125-131. The first portion of the file is stored in
base extent 125, and the subsequent portions of the file are
distributed accordingly in extents 2-7, which are labelled 126-131.
The seven extents 125-131 of file E are not physically contiguous. To
maintain file extents information, an extent descriptor 140 is used
for the base extent 125 and each of the subsequent extents 126-131 of
file E.

Extent descriptor 140 is comprised of a starting allocation unit
number 141 and a number of allocation units 142. File E extents list
135, which is comprised of seven extent descriptors 125a-131a,
provides information as to the address and length of each extent
125-131 of file E. For example, the fourth extent 128, which has a
starting allocation address of 189 and is only two allocation blocks
long, has a value of 189 in field 141 and a value of 2 in field 142 of
descriptor 128a.
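
As a worked illustration of the two descriptor fields, the Python
sketch below converts an extent descriptor into a byte range on the
volume. Only the 512-byte block size and the descriptor values 189 and
2 come from the text. The names are hypothetical, and the sketch
assumes that allocation unit 0 begins at the first block of the file
content area; the example values for that first block and for the
number of blocks per allocation unit are invented.

```python
from collections import namedtuple

BLOCK_SIZE = 512  # bytes per logical block in the preferred embodiment

# Field 141 (starting allocation unit) and field 142 (number of units).
ExtentDescriptor = namedtuple("ExtentDescriptor", "start_unit unit_count")


def extent_byte_range(extent, first_data_block, blocks_per_unit):
    """Return (byte offset on the volume, length in bytes) of an extent.

    Assumes allocation unit 0 begins at `first_data_block`, that is,
    block n+1 in the description of the volume layout.
    """
    unit_bytes = blocks_per_unit * BLOCK_SIZE
    start = first_data_block * BLOCK_SIZE + extent.start_unit * unit_bytes
    return start, extent.unit_count * unit_bytes


descriptor_128a = ExtentDescriptor(start_unit=189, unit_count=2)
# first_data_block=10 and blocks_per_unit=2 are made-up example values.
print(extent_byte_range(descriptor_128a, first_data_block=10, blocks_per_unit=2))
# (198656, 2048)
```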

Extent descriptors of all files in a volume are maintained in
the present invention in the data records contained in the leaf
nodes of a B-Tree such as that of Figures 3-5. This tree is a
separate B-Tree from the earlier described catalog B-Tree. Each
data record of this extents B-Tree consists of a key and an
information segment, as before in the discussion of Figures 3-5.
The information segment of a File Extents B-Tree data record is
comprised of a sequence of extent descriptors of a particular
file. The maximum number of extent descriptors in such a record
can vary from implementation to implementation, but in the
preferred embodiment is set to three. The key of the File
Extents B-Tree record consists of two fields: the file number of
the particular file and the file relative position of the
starting block of the first extent descriptor in that record.
These extents records are kept in the leaf nodes of the Extents
B-Tree sorted in ascending order first on the file number field
and then on the file relative position of the starting block.
This allows efficient search through the B-Tree for the location
information of data at a particular file relative position.
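The two-field key ordering can be illustrated with a short sketch. It assumes the keys are modeled as (file number, file relative starting block) tuples held in a flat sorted list rather than in real B-Tree leaf nodes; the function name and the sample keys are hypothetical.

```python
import bisect

# Hypothetical keys: (file_number, file_relative_start_block).
keys = [(20, 9), (7, 12), (20, 15), (31, 4)]
keys.sort()  # ascending on file number first, then on starting block


def locate_record(sorted_keys, file_number, relative_block):
    """Return the key of the record that would hold relative_block.

    The record wanted is the last one whose key is not greater than
    the search key and that belongs to the same file. Whether the
    block really lies inside that record's extents must still be
    checked against the extent lengths, which this sketch omits.
    """
    i = bisect.bisect_right(sorted_keys, (file_number, relative_block))
    if i == 0:
        return None
    candidate = sorted_keys[i - 1]
    return candidate if candidate[0] == file_number else None
```

Because all records of one file are adjacent and ordered by starting block, a single ordered search is enough to reach the record for any file relative position.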

In actuality, the preferred embodiment stores three extent
descriptors, the base plus two subsequent extent descriptors, in
the information data segment 114 of the file's catalog B-Tree
record, such as 94 of Figure 6. Therefore, in the example of
Figure 8, extent descriptors 125a, 126a and 127a are kept in the
information segment of the cataloging structure and extent
descriptors 128a-131a are kept in the File Extents B-Tree, as
shown in Figure 9. Permitting limited extent information to be
kept in the data segments of a cataloging structure permits
faster access to data. Only when a file contains four extents or
more will it be necessary to consult the File Extents B-Tree. It
should be appreciated that the number of extents which are kept
in the file's Catalog B-Tree record without using a File Extents
B-Tree is arbitrary and can be changed without departing from
the spirit and scope of the invention.
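The division of an extents list between the catalog record and the File Extents B-Tree can be sketched as follows. The function and constant names are assumptions made for illustration. The extent lengths and the file number 20 follow the file E example of this description; the starting allocation units are invented, except for 189.

```python
CATALOG_EXTENTS = 3  # descriptors kept in the catalog record
RECORD_EXTENTS = 3   # maximum descriptors per File Extents B-Tree record


def split_extents(file_number, extents):
    """Split a list of (start_allocation_unit, length) descriptors.

    Returns the descriptors kept in the catalog record and a list of
    overflow records, each a pair of
    ((file_number, file_relative_start_block), descriptors).
    """
    catalog = extents[:CATALOG_EXTENTS]
    relative_block = sum(length for _, length in catalog)
    records = []
    for i in range(CATALOG_EXTENTS, len(extents), RECORD_EXTENTS):
        chunk = extents[i:i + RECORD_EXTENTS]
        records.append(((file_number, relative_block), chunk))
        relative_block += sum(length for _, length in chunk)
    return catalog, records


# File E: seven extents of lengths 3, 5, 1, 2, 3, 1 and 7.
file_e = [(40, 3), (75, 5), (112, 1), (189, 2),
          (230, 3), (301, 1), (350, 7)]
catalog_part, overflow = split_extents(20, file_e)
```

With these values the overflow consists of two records whose keys carry the starting positions 9 and 15, so a file with three or fewer extents never produces a File Extents B-Tree record at all.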

Figure 9 shows a catalog file record 145 and File Extents B-Tree
records 143 and 144. As explained in the structure of B-Trees of
the present invention, records 143 and 144 are comprised of keys
148 and 149 and extents lists 146 and 147, respectively. To
locate a certain portion of the data of a particular file, first
the Catalog B-Tree is searched for the corresponding file
record. From this file record's information segment, the file
number is extracted. Also, the first three extent descriptors in
the information segment of the catalog B-Tree file record are
examined. If the required file data is contained within the
corresponding extents, then the location information is readily
available. If, however, the desired file data is located in
extents beyond the three in the catalog's file record, then a
search is made of the File Extents B-Tree using as a search key
the file number and the computed file relative block position of
the desired data. This search will
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ing the desired location information. 

In the example, file E is comprised of 22 blocks and has an
arbitrary file number equal to 20. The extent descriptors
contained in the catalog file record 145 for file E provide the
location information for the first 3 extents, which in turn
comprise the first 9 blocks (3 + 5 + 1) of the file. The
location information for the remaining 13 blocks (2 + 3 + 1 + 7)
of the file is contained in two data records 143 and 144 within
the File Extents B-Tree. Assume that the desired data is at file
relative block position 13 within file E. The extent descriptors
contained in the file's catalog record are examined first. Since
relative block 13 is greater than the number of blocks located
by the extent descriptors in the file's catalog record, the File
Extents B-Tree is searched. The key used for the B-Tree search
for relative block position 13 is <20,13>. Since the key value
of "13" is greater than the value "9" of key 148 for the first
File Extents B-Tree record 143 for file E and is less than the
value "15" of key 149 for the second record 144, the search ends
with a "not found" result but positions to the second B-Tree
record 144. By retrieving the previous record 143 of key 148,
the extent descriptor for relative block 13 is obtained. The
value of "9" for key 148 is derived because extents list 146
starts at the tenth relative block (allocation unit number 9).
The value of "15" for key 149 is derived because extents list
147 starts at the sixteenth relative block (allocation unit
number 15).
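The lookup for relative block 13 can be traced with a small sketch. It assumes the File Extents B-Tree leaf records are modeled as a sorted list and that relative blocks are numbered from zero. The function names are hypothetical and the starting allocation units are invented, except for 189; the extent lengths, the file number and the keys are those of the example.

```python
import bisect

FILE_NUMBER = 20
# (start_allocation_unit, length) descriptors held in catalog record 145.
CATALOG_EXTENTS = [(40, 3), (75, 5), (112, 1)]
# Sorted leaf records of the File Extents B-Tree: (key, extents list).
EXTENTS_TREE = [
    ((20, 9), [(189, 2), (230, 3), (301, 1)]),   # record 143, key 148
    ((20, 15), [(350, 7)]),                      # record 144, key 149
]


def walk(extents, first_block, target):
    """Return the allocation unit holding target, or None.

    first_block is the file relative block at which extents begins.
    """
    block = first_block
    for start, length in extents:
        if target < block + length:
            return start + (target - block)
        block += length
    return None


def locate(target):
    """Locate a file relative block: catalog record first, then tree."""
    found = walk(CATALOG_EXTENTS, 0, target)
    if found is not None:
        return found
    keys = [key for key, _ in EXTENTS_TREE]
    search_key = (FILE_NUMBER, target)
    i = bisect.bisect_left(keys, search_key)
    if i == len(keys) or keys[i] != search_key:
        i -= 1  # "not found": step back to the previous record
    if i < 0 or keys[i][0] != FILE_NUMBER:
        return None
    key, extents = EXTENTS_TREE[i]
    return walk(extents, key[1], target)
```

For the search key <20,13> the ordered search positions at record 144, the step back selects record 143, and the block is found in the second descriptor of extents list 146.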

Implementation 

The HFS of the present invention is implemented in a computer
which is coupled to a memory device, such as a disk, capable of
storing millions of bits of information, although any storage
medium can use the HFS. Typically, the HFS of the present
invention provides the cataloging of various groupings of data,
such as files, which are stored on the disk.

The preferred embodiment implements data storage by the use of
the cataloging structure previously described to catalog data
stored on a large capacity memory device. It also maintains a
file extents record of up to three extents per file in the
catalog. Subsequent extent information is stored in a separate
file extents record. The catalog records and the extents records
are maintained in two separate B-Trees of the earlier described
B-Tree structure.

The HFS as described in the preferred embodiment is controlled
by a combination of hardware and software in a computer system.
The HFS controlling routines are stored in a storage device
separate from the device used for storing the actual data. The
preferred embodiment stores the routines in a read only



medium may be used. 

Thus, a hierarchical filing system for use with a large capacity
memory device is described.


CLAIMS 

1. In a process where information is stored on a memory device,
a method for preparing a computer program for cataloging said
information, comprising the steps of:

grouping said information into a plurality of files;

implementing a hierarchical structure which has a beginning
node, a plurality of termination nodes, and a plurality of
intermediate nodes arranged at various subsequent levels from
said beginning node and interconnecting some of said termination
nodes to said beginning node, such that there is only one
interconnecting path from said beginning node to each of said
termination nodes;

placing location description information for each of said files
in a predetermined termination node, such that each of said
termination nodes includes its associated file location
description and provides said location description for
retrieving its associated file;

assigning a unique value to each of said files;

whereby said information for a particular file is retrieved by
searching for its associated value in said hierarchical
structure.

2. The method defined by Claim 1 further comprising the steps
of:

implementing a B-Tree structure; and

placing each of said unique values and its associated location
description information in a predetermined leaf node of said
B-Tree.

3. The method defined by Claim 2 wherein said placing of said
unique values in said leaf nodes comprises the step of:

arranging said unique values in an ascending order in said leaf
nodes.

4. The method defined by Claim 3 wherein said placing of said
location description information in said predetermined
termination nodes further comprises the step of:

providing for several location description information for each
file when said information for a respective file is segmented
into a plurality of physically non-contiguous segments on said
memory device.

5. In a process where information is cataloged in a filing
system, a method for preparing a computer program for providing
said filing system, comprising the steps of:

ordering a hierarchical nodal structure which has a root
directory, a plurality of branching directories and a plurality
of files, wherein each said file traces a singular path from
itself to said root directory such that said singular path can
transition through said branching directories;

assigning a unique identification value to 



GB2 196 764A 



assigning a unique identification name to 
each of said files; 

placing location description information of stored data in its
corresponding file wherein each of said files references a
particular grouping of said stored data;

whereby said particular grouping of said stored data is
cataloged by its corresponding name in said hierarchical
structure.

6. The method defined by Claim 5 further comprising the steps
of:

implementing a B-Tree structure having a root index node, a
plurality of branching index nodes arranged at various
subsequent levels from said root index node and terminating in a
plurality of leaf nodes; and

ordering said hierarchical nodal structure in said leaf nodes
by:

associating each of said names for each of said files with one
of said values of a corresponding directory which is immediately
above in said singular path;

associating each of said value for each of said directory with a
value of a corresponding directory which is immediately above in
said singular path;

provides linking of said files and directories, such that each
of said files can be accessed by accessing any directory along
said singular path.

7. The method defined by Claim 6 wherein said ordering of said
hierarchical structure in said leaf nodes further comprises the
step of:

arranging said values in said leaf nodes of a B-Tree in an
ascending order such that each said unique value is associated
with its respective data record comprising of singular path
linking information, wherein a first value in each of said node
is also listed in a connected index node of a previous level to
form an interconnecting sequence from said root index node to
each of said leaf nodes.

8. The method defined by Claim 7 wherein said placing of said
location description information for each particular grouping of
stored data further comprises the step of:

providing for several location descriptions when said grouping
is segmented into a plurality of physically noncontiguous
segments on said memory device.

9. In a process where information is cataloged in a filing
system and retrieved from a memory device by using said filing
system, a method for preparing a computer program for providing
said filing system, comprising the steps of:

ordering a hierarchical cataloging structure which has a root
directory, a plurality of branching directories arranged at
various subsequent levels from said root directory, wherein some
of said branching directories branch from other of said
branching directories; said branching directories being
interconnected such that for each of said branching directories
there is only a singular path from itself to said root
directory;

assigning a unique key value to each of said directories to
distinguish said directories;

structuring a plurality of files within said hierarchical
structure wherein each of said files branch from its associated
directory; each of said files having a unique identifying name
is associated with a particular grouping of data stored in said
memory device;

placing location description information for each of said
particular grouping of data in its respective file;

placing in each directory and file said key value of its parent
directory such that said singular path is determined by
referencing said key value of said parent directory;

retrieving said particular grouping of data by traversing
downward through said hierarchical structure to said respective
file by starting at any directory along said respective path,
wherein said file provides said location description
information;

whereby search and retrieval of stored data is conducted by a
systematic and hierarchical technique.

10. The method defined by Claim 9 further comprising the steps
of:

implementing a B-Tree structure having a root index node, a
plurality of branching index nodes arranged at various
subsequent levels from said root index node and terminating in a
plurality of leaf nodes;

organizing said hierarchical cataloging structure in said leaf
nodes such that said directories and files are distributed
according to their parent directory value in an ascending order;

placing a first value of each node of said B-Tree in a connected
index node of a previous level to form an interconnecting
sequence from said root index node to each of said leaf nodes;

searching for a predetermined key value from said hierarchical
structure by traversing across a level of said B-Tree until a
higher value than said predetermined key value is found, then
traversing down to a next lower level by taking a path provided
by a next lower key value from said higher value, and repeating
said traversals until one of said leaf nodes is reached.

11. The method defined by Claim 10 wherein placing of said
location description information for each of said particular
grouping of stored data further comprises the step of:

providing for several location descriptions when said grouping
is segmented into a plurality of physically noncontiguous
segments on said memory device.

12. The method defined by Claim 10 further comprising the step
of:

providing a second B-Tree to maintain location description
information when said grouping is segmented into a plurality of
physically non-contiguous segments on said memory device.
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13. The method defined by Claim 10 further 
comprising the step of: 

providing a second B-Tree to maintain location description
information of excess segments when said non-contiguous segments
exceeds a predetermined number.

14. In a computer a hierarchical filing system to provide
cataloging and retrieval of data stored on a storage device,
said hierarchical filing system comprising:

a memory for storing a program for said cataloging and
retrieval;

a processor coupled to said memory and said storage device for
manipulating said program, catalog and retrieve said data;

said program for ordering a hierarchical cataloging structure
which has a root directory, a plurality of branching directories
arranged at various subsequent levels from said root directory,
wherein some of said branching directories branch from other of
said branching directories; said branching directories being
interconnected such that for each of said branching directories
there is only a singular path from itself to said root
directory;

said program for assigning a unique key value to each of said
directories to distinguish said directories;

said program for structuring a plurality of files within said
hierarchical structure wherein each of said files branch from
its associated directory; each of said files associated with a
particular grouping of data stored in said memory device;

said program for placing location description information for
each of said particular grouping of data in its respective file;

said program for placing in each directory and file said key
value of its parent directory such that said singular path is
determined by referencing said key value of said parent
directory;

said program for retrieving said particular grouping of data by
traversing downward through said hierarchical structure to said
respective file, wherein said file provides said location
description information;

whereby search and retrieval of stored data is conducted by a
systematic and hierarchical technique.

15. The hierarchical filing system defined in 
Claim 14, wherein said program is stored in a 
read only memory. 

16. In a process where information is stored on a memory device,
a method for preparing a computer program for cataloging said
information substantially as hereinbefore described with
reference to the accompanying drawings.

17. In a computer a hierarchical filing system to provide
cataloging and retrieval of data stored on a storage device,
said hierarchical filing system being substantially as
hereinbefore described with reference to the accompanying
drawings.
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